[57] ABSTRACT

A method for performing image compression that elimi- 
nates redundant and invisible image components. The 
image compression uses a Discrete Cosine Transform 
(DCT) and each DCT coefficient yielded by the trans- 
form is quantized by an entry in a quantization matrix 
which determines the perceived image quality and the 
bit rate of the image being compressed. The present 
invention adapts or customizes the quantization matrix 
to the image being compressed. The quantization matrix 
comprises visual masking by luminance and contrast 
techniques and by an error pooling technique all result- 
ing in a minimum perceptual error for any given bit 
rate, or minimum bit rate for a given perceptual error. 

10 Claims, 6 Drawing Sheets

IMAGE DATA COMPRESSION HAVING
MINIMUM PERCEPTUAL ERROR 

ORIGIN OF THE DISCLOSURE 

The invention described herein was made by an em- 
ployee of the National Aeronautics and Space Adminis- 
tration and it may be manufactured and used by and for 
the United States Government for governmental purposes
without the payment of royalties thereon or therefor.

BACKGROUND OF THE INVENTION 

A. Technical Field of the Invention:

The present invention relates to an apparatus and 
method for coding images, and more particularly, to an 
apparatus and method for compressing images to a re- 
duced number of bits by employing a Discrete Cosine 
Transform (DCT) in combination with visual masking 
including luminance and contrast techniques as well as 
error pooling techniques all to yield a quantization ma- 
trix optimizer that provides an image having a minimum 
perceptual error for a given bit rate, or a minimum bit 
rate for a given perceptual error. 

B. Description of the Prior Art: 

Considerable research has been conducted in the field 
of data compression, especially the compression of digi- 
tal information of digital images. Digital images com- 
prise a rapidly growing segment of the digital informa- 
tion stored and communicated by science, commerce, 
industry and government. Digital image transmission
has gained significant importance in highly advanced 
television systems, such as high definition television 
using digital information. Because a relatively large 
number of digital bits are required to represent digital 
images, a difficult burden is placed on the infrastructure 
of the computer communication networks involved 
with the creation, transmission and re-creation of digital 
images. For this reason, there is a need to compress 
digital images to a smaller number of bits, by reducing 
redundancy and invisible image components of the im- 
ages themselves. 

A system that performs image compression is dis- 
closed in U.S. Pat. No. 5,121,216 of C.E. Chen et al, 
issued Jun. 9, 1992, and herein incorporated by refer- 
ence. The ’216 patent discloses a transform coding algo- 
rithm for a still image, wherein the image is divided into 
small blocks of pixels. For example, each block of pixels 
may be either an 8×8 or a 16×16 block. Each block of
pixels then undergoes a two dimensional transform to 
produce a two dimensional array of transform coeffici- 
ents. For still image coding applications, a Discrete 
Cosine Transform (DCT) is utilized to provide the or- 
thogonal transform. 

In addition to the ’216 patent, the Discrete Cosine
Transform is also employed in a number of current and
future international standards concerned with digital
image compression, commonly referred to as JPEG and
MPEG, which are acronyms for Joint Photographic
Experts Group and Moving Picture Experts Group,
respectively. After a block of pixels of the ’216 patent
undergoes a Discrete Cosine Transform (DCT), the
resulting transform coefficients are subject to compression
by thresholding and quantization operations.
Thresholding involves setting all coefficients whose
magnitude is smaller than a threshold value equal to zero,
whereas quantization involves scaling a coefficient by a
step size and rounding off to the nearest integer.
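
The two operations just described can be illustrated with a minimal sketch. The function names, the sample coefficient values, the threshold of 2.0 and the step size of 8.0 are illustrative assumptions and are not taken from the ’216 patent.

```python
def threshold(coeffs, limit):
    """Set every coefficient whose magnitude is smaller than `limit` to zero."""
    return [0.0 if abs(c) < limit else c for c in coeffs]


def quantize(coeffs, step):
    """Scale each coefficient by the step size and round to the nearest integer."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Recover approximate coefficients by multiplying by the step size."""
    return [u * step for u in levels]


coeffs = [103.0, -37.5, 6.2, -1.4, 0.6]   # hypothetical transform coefficients
kept = threshold(coeffs, limit=2.0)       # small coefficients become zero
levels = quantize(kept, step=8.0)         # integers passed on to the coder
approx = dequantize(levels, step=8.0)     # what a decoder would reconstruct
print(kept, levels, approx)
```

A larger step size yields smaller integers, and hence fewer bits, at the cost of a larger difference between `coeffs` and `approx`.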

Commonly, the quantization of each DCT coefficient 
is determined by an entry in a quantization matrix. It is 
this matrix that is primarily responsible for the per- 
ceived image quality and the bit rate of the transmission 
of the image. The perceived image quality is important 
because the human visual system can tolerate a certain 
amount of degradation of an image without being 
alerted to a noticeable error. Therefore, certain images 
can be transmitted at a low bit rate, whereas other im- 
ages cannot tolerate any degradation and should be 
transmitted at a higher bit rate in order to preserve their 
informational content. 

The ’216 patent discloses a method for the compression
of image information based on human visual sensitivity
to quantization errors. In the method of the ’216 patent,
there is a quantization characteristic associated
with block to block components of an image. This
quantization characteristic is based on a busyness
measurement of the image. The method of the ’216 patent
does not compute a complete quantization matrix, but
rather only a single scalar quantizer.

Two other methods are available for computing
DCT quantization matrices based on human sensitivity.
One is based on a mathematical formula for the human
contrast sensitivity function, scaled for viewing
distance and display resolution, and is disclosed in U.S.
Pat. No. 4,780,761 of S.J. Daly et al. The second is
based on a formula for the visibility of individual DCT
basis functions, as a function of viewing distance,
display resolution, and display luminance. The second
formula is disclosed in both a first article entitled
“Luminance-Model-Based DCT Quantization For
Color Image Compression” of A.J. Ahumada et al.,
published in 1992 in Human Vision, Visual Processing,
and Digital Display III, Proc. SPIE 1666, Paper 32,
and a second technical article entitled “An Improved
Detection Model for DCT Coefficient Quantization” of
H.A. Peterson et al., published in 1993 in Human Vision,
Visual Processing and Digital Display VI, Proc.
SPIE Vol. 1913, pages 191-201. The methods described
in the ’761 patent and the two technical articles do not
adapt the quantization matrix to the image being
compressed, and therefore do not take advantage of masking
techniques for quantization errors that utilize the image
itself. Each of these techniques has features and benefits
described below.

First, visual thresholds increase with background
luminance and this feature should be advantageously
utilized. However, the formula given in both referenced
technical articles describes the threshold for
DCT basis functions as a function of mean luminance.
This would normally be taken as the mean luminance of
the display. However, variations in local mean luminance
within the image will in fact produce substantial
variations in the DCT threshold quantities. These
variations are referred to herein as “luminance masking” and
should be fully taken into account.

Second, the visibility of a visual pattern is typically
reduced in the presence of other patterns, particularly
those of similar spatial frequency and orientation. This
reduction phenomenon is usually called “contrast masking.”
This means that a threshold error in a particular
DCT coefficient in a particular block of the image will
be a function of the value of that coefficient in the
original image. The knowledge of this function should be
taken advantage of in order to compress the image
while not reducing the quality of the compressed image.

Third, the method disclosed in the two referenced
technical articles ensures that a single error is below a
predetermined threshold. However, in a typical image
there are many errors of varying magnitudes that are
not properly handled by a single threshold quantity.
The visibility of this error ensemble, which spans all of
the varying magnitudes, is not generally equal to the
visibility of the largest error, but rather reflects a
pooling of errors over both frequencies and blocks of the
image. This pooling is herein termed “error pooling” and
is beneficial in compressing the digital information of
the image while not degrading the quality of the image.

Fourth, when all errors are kept below a perceptual
threshold, a certain bit rate will result, but at times it
may be desired to have an even lower bit rate. The two
referenced technical articles do not disclose any method
that would yield a minimum perceptual error for a
given bit rate, or a minimum bit rate for a given
perceptual error. It is desired that such a method be provided
to accommodate this need.

Finally, it is desired that all of the above prior art
limitations and drawbacks be eliminated so that a digital
image may be represented by a reduced number of
digital bits while at the same time providing an image
having a low perceptual error.

Accordingly, an object of the present invention is to
provide a method to compress digital information yet
provide a visually optimized image.

Another object of the present invention is to provide
a method of compressing a visual image based on
luminance masking, contrast masking, and error pooling
techniques.

A further object of the present invention is to provide
a quantization matrix that is adapted to the individual
image being compressed so that the image that is
reproduced has a high resolution and a low perceptual error.

A still further object of the present invention is to
provide a method that yields minimal perceptual error
of an image for a given bit rate, or a minimum bit rate
for a given perceptual error of the image.

SUMMARY OF THE INVENTION

The invention is directed to digital compression of
images, comprising a plurality of blocks of pixels, that
uses the DCT transform coefficients yielded from a
Discrete Cosine Transform (DCT) of all the blocks as
well as other display and perceptual parameters all to
generate a quantization matrix which, in turn, yields a
reproduced image having a low perceptual error. The
invention adapts or customizes the individual
quantization matrix to the image being compressed.

The present invention transforms a block of pixels
from an electronic image into a digital representation of
that image and comprises the steps of applying a
Discrete Cosine Transform (DCT), selecting a DCT mask
(m_ijk) for each block of pixels, and selecting a
quantization matrix (q_ij) for quantizing DCT transformation
coefficients (c_ijk) produced by the DCT transformation.
The application of a Discrete Cosine Transform (DCT)
transforms the block of pixels into a digital signal
represented by the DCT coefficients (c_ijk). The DCT mask is
based on parameters comprising DCT coefficients (c_ijk),
and display parameters. The selection of the
quantization matrix (q_ij) comprises the steps of: (i) selecting an
initial value of q_ij; (ii) quantizing the DCT coefficient c_ijk
in each block k to form quantized coefficient u_ijk; (iii)
inverse quantizing u_ijk by multiplying by q_ij; (iv)
subtracting the reconstructed coefficient q_ij u_ijk from c_ijk to
compute the quantization error e_ijk; (v) dividing e_ijk by
the DCT mask m_ijk to obtain perceptual errors; (vi)
pooling the perceptual errors of one frequency ij over all
blocks k to obtain an entry in a perceptual error matrix
p_ij; (vii) repeating this process (i-vi) for each
frequency ij; and (viii) adjusting the values of q_ij up or
down until each entry in the perceptual error matrix p_ij
is within a target range.
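
Steps (ii) through (vi) for a single frequency can be sketched as follows. This is only an illustration under stated assumptions: the function name and sample values are hypothetical, and the pooling rule (a sum of errors raised to an exponent of 4, followed by the matching root) is assumed from the pooling exponent discussed later in this description.

```python
def perceptual_error(coeffs, masks, q, beta=4.0):
    """Pooled perceptual error for one DCT frequency over all blocks k.

    coeffs[k] plays the role of c_ijk, masks[k] of m_ijk, and q of the
    trial quantization matrix entry q_ij. The pooling form is an assumption.
    """
    total = 0.0
    for c, m in zip(coeffs, masks):
        u = round(c / q)            # (ii) quantize
        recon = q * u               # (iii) inverse quantize
        e = c - recon               # (iv) quantization error
        d = e / m                   # (v) perceptual (masked) error
        total += abs(d) ** beta     # (vi) pool over blocks
    return total ** (1.0 / beta)


coeffs = [10.0, -23.0, 4.0]   # hypothetical coefficients of one frequency
masks = [2.0, 2.0, 2.0]       # hypothetical DCT mask values
print(perceptual_error(coeffs, masks, q=8.0))
print(perceptual_error(coeffs, masks, q=1.0))
```

With integer-valued coefficients a step size of 1 reproduces every coefficient exactly, so the pooled error is zero; larger step sizes trade perceptual error for bit rate, which is what step (viii) adjusts.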

The method preferably comprises a further step of 
entropy coding the digital representation of the image. 
In addition, the invention further comprises providing a 
computer network for implementing the practice of the 
present invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of a computer network that 
may be used in the practice of the present invention. 

FIG. 2 schematically illustrates some of the steps 
involved with the method of the present invention. 

FIG. 3 schematically illustrates the steps involved, in 
one embodiment, with the formation of the quantization 
matrix optimizer of the present invention. 

FIG. 4 schematically illustrates the steps involved, in 
another embodiment, with the formation of the quanti- 
zation matrix optimizer of the present invention. 

FIG. 5 illustrates a series of plots showing the varia- 
tions of the luminance patterns of a digital image. 

FIG. 6 is a plot of the contrast masking function 
involved in the practice of the present invention. 

FIG. 7 illustrates two plots each of a different digital 
image and each showing the relationship between the 
perceptual error and the bit rate involved in image 
compression. 

DETAILED DESCRIPTION OF THE 
INVENTION 

Referring now to the drawings wherein like refer- 
ence numerals designate like elements, there is shown in 
FIG. 1 a block diagram of a computer network 10 that 
may be used in the practice of the present invention.
The network 10 is particularly suited for performing the 
method of the present invention related to a still image 
that may be stored, retrieved or transmitted. For the 
embodiment shown in FIG. 1, a first group of channel- 
ized equipment 12 and a second group of channelized 
equipment 14 are provided. Further, as will be further 
described, for the embodiment shown in FIG. 1, the 
channelized equipment 12 is used to perform the storage 
mode 16/retrieval mode 18 operations of the network 
10 and, similarly, the channelized equipment 14 is used 
to perform the storage mode 16/retrieval mode 18 oper- 
ations of the network 10. As will be further described, 
the storage mode 16 is shown as accessing each disk 
subsystem 20, whereas the retrieval mode 18 is shown as 
recovering information from each disk subsystem 20. 
Each of the channelized equipments 12 and 14 may be a 
SUN SPARC computer station whose operation is dis- 
closed in instruction manual Sun Microsystems Part 
#800-5701-10. Each of the channelized equipments 12 
and 14 is comprised of elements having the reference 
numbers given in Table 1. 

TABLE 1

Reference No.   Element

20              Disk Subsystem
22              Communication Channel
24              CPU Processor
26              Random Access Memory (RAM)
28              Display Subsystem


In general, and as to be more fully described, the
method of the present invention, being run in the
network 10, utilizes, in part, a Discrete Cosine Transform
(DCT), discussed in the “Background” section, to
accomplish image compression. In the storage mode 16,
an original image 30, represented by a plurality of
digital bits, is received from a scanner or other source at the
communication channel 22 of the channelized
equipment 12. The image 30 is treated as a digital file
containing pixel data. The channelized equipment 12, in
particular the CPU processor 24, performs a DCT
transformation, computes a DCT mask and iteratively estimates
a quantization matrix optimizer. The channelized
equipment 12 then quantizes the digital bits comprising the
image 30, and performs run-length encoding and
Huffman or arithmetic coding of the quantized DCT
coefficients. Run-length encoding, arithmetic coding and
Huffman coding are well-known and reference may be
respectively made to the discussion of reference
numbers 24 and 28 of U.S. Pat. No. 5,170,264, herein
incorporated by reference, for a further discussion thereof.
The optimized quantization matrix is then stored in
coded form along with coded coefficient data,
following a JPEG or other standard. The compressed file is
then stored on the disk subsystem 20 of the channelized
equipment 12.

In the retrieval mode 18, the channelized equipment
12 (or 14) retrieves the compressed file from the disk
subsystem 20, and decodes the quantization matrix and
the DCT coefficient data. The channelized equipment
12, or the channelized equipment 14 to be described,
then de-quantizes the coefficients by multiplication by
the quantization matrix and performs an inverse DCT.
The resulting digital file containing pixel data is
available for display on the display subsystem 28 of the
channelized equipment 12 or can be transmitted to the
channelized equipment 14 or elsewhere by the
communication channel 22. The resulting digital file is shown in
FIG. 1 as 30' (IMAGE). The operation of the present
invention may be further described with reference to
FIG. 2.

FIG. 2 is primarily segmented to illustrate the storage
mode 16 and the retrieval mode 18. FIG. 2 illustrates
that the storage mode 16 is accomplished in channelized
equipment, such as channelized equipment 12, and the
retrieval mode is accomplished in the same or another
channelized equipment, such as channelized equipment
14. The channelized equipments 12 and 14 are
interfaced to each other by the communication channel 22.
The image 30 being compressed by the operation of the
present invention comprises a two-dimensional array of
pixels, e.g., 256×256 pixels. This array of pixels is
composed of contiguous blocks, e.g., 8×8 blocks, of pixels
representatively shown in segment 32. The storage
mode 16 is segmented into the following steps: block 32,
DCT 34, initial matrix 35, quantization matrix optimizer
36, quantize 38, and entropy code 40. The retrieval
mode 18 is segmented into the following steps: entropy
decode 42, de-quantize 44, inverse DCT 46, and
un-block 48. The steps shown in FIG. 2 (to be further
discussed with reference to FIG. 3) are associated with
the image compression of the present invention and, in
order to more clearly describe such compression,
reference is first made to the quantities listed in Table 2
having a general definition given therein.


TABLE 2

Quantities                     General Definition

i, j                           indexes of the DCT frequency (or basis
                               function)
k                              index of a block of the image
c_ijk                          DCT coefficients of an image
q_ij                           quantization matrix
u_ijk                          quantized DCT coefficients
e_ijk                          DCT error
t_ij                           DCT threshold matrix (based on global
                               mean luminance)
apw[i, j, L, px, py ...]       threshold formula of Peterson et al. given
                               in the article Human Vision, Visual
                               Processing and Digital Display VI
                               (previously cited)
t_ijk                          DCT threshold matrix (based on local
                               mean luminance c_00k)
a_t                            luminance masking exponent
w_ij                           contrast masking exponent (Weber
                               exponent)
m_ijk                          DCT mask
d_ijk                          perceptual error in a particular frequency
                               i, j and block k
p_ij                           perceptual error matrix
β_s                            spatial error-pooling exponent
c_00k                          DC coefficient in block k
L_0                            mean luminance of the display
c_00                           average DC coefficient, corresponding to
                               L_0 (typically 1024)
T                              target total perceptual error value


Each block (step 32) of the pixels is subjected to the
application of a Discrete Cosine Transform (DCT) (step
34) yielding related DCT coefficients. The two-dimensional
Discrete Cosine Transform (DCT) is well known
and may be such as described in the previously
incorporated by reference U.S. Pat. No. 5,121,216. The
coefficients of the DCT, herein termed c_ijk, obtained by the
Discrete Cosine Transform (DCT) of each block of
pixels comprise DC and AC components. The DC
coefficient is herein termed c_00k (0,0) which represents the
average intensity of the block. The remainder of the
coefficients c_ijk are termed AC coefficients (0,1), (1,0) . . .
(i,j).
The DCT (step 34) of all blocks (step 32), along with
the display and perceptual parameters (to be described)
and an initial matrix, are all inputted into a quantization
matrix optimizer 36, which is a process that creates an
optimized quantization matrix which is used to quantize
(step 38) the DCT coefficients. The optimized
quantization matrix is also transferred, by the communication
channel 22 of the channelized equipment 12, for its use
in the retrieval mode 18 that is accomplished in the
channelized equipment 14. The quantized DCT
coefficients (u_ijk) are entropy coded (step 40) and then sent to
the communication channel 22. Entropy coding is
well-known in the communication art and is a technique
wherein the amount of information in a message is based
on log n, where n is the number of possible equivalent
messages contained in such information.

At the receiving channelized equipment 14, an
inverse process occurs to reconstruct the original block of
pixels. Thus, the received bit stream of digital
information containing quantized DCT coefficients u_ijk is
entropy decoded (step 42) and the coefficients are then
de-quantized (step 44), such as by multiplying by the
quantization step size q_ij to be described. An inverse
transform, such as an inverse Discrete Cosine Transform
(DCT), is then ap-

[...]

then applied to the processing segment 62. The
processing segment 62 causes all the scaled errors to be pooled
over all of the blocks, separately for each DCT
frequency (i,j). The term “error pooling” is meant to
represent that the errors are combined over all of the DCT
coefficients rather than having one relatively large
error in one DCT coefficient dominating the other
errors in the remaining DCT coefficients. The pooling
is accomplished by a routine having a relationship of
expression 7:

p_{ij} = \left( \sum_{k} |d_{ijk}|^{\beta_s} \right)^{1/\beta_s}    (7)

where d_ijk is an error in a particular frequency i,j and
block k, and β_s is a pooling exponent having a typical value
of 4. It is allowed that the routine of expression (7)
provide a matrix of exponents β_s since the pooling of
errors may vary for different DCT coefficients.
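
The behavior of the pooling exponent can be illustrated with a short sketch, assuming expression (7) has the usual form of a sum of error magnitudes raised to β_s followed by the 1/β_s root. The function name and the sample error values are hypothetical.

```python
def pool(errors, beta=4.0):
    """Pool the masked errors d_ijk of one frequency over all blocks k."""
    return sum(abs(d) ** beta for d in errors) ** (1.0 / beta)


equal_errors = [1.0, 1.0, 1.0, 1.0]   # four blocks, each exactly at threshold
one_large = [2.0, 0.1, 0.1, 0.1]      # one dominant error among small ones

print(pool(equal_errors))             # exceeds the largest single error
print(pool(one_large))                # only slightly above the largest error
print(pool(one_large, beta=64.0))     # large exponents approach the maximum
```

With an exponent of 4, several errors that are individually at threshold pool to a value above threshold, which is the effect that a single-threshold criterion does not capture.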

The matrix p_ij of expression (7) is the “perceptual
error matrix” and is a simple measure of the visibility of
artifacts within each of the frequency bands defined by
the DCT basis functions. More particularly, the
perceptual error matrix is a good indication of whether or not
the human eye can perceive a dilution of the image that
is being compressed. The perceptual error matrix p_ij
developed by segments 56, 58, 60 and, finally, segment
62 is applied to processing segment 64.

In processing segment 64, each element of the
perceptual error matrix p_ij is compared to a target error
parameter T, which specifies a global perceptual quality of the
compressed image. This global quality is somewhat like
the entries in the perceptual error matrix and again is a
good indication of the amount of degradation that the
compressed image may suffer without being perceived
by the human eye. If all quantities or errors generated
by segment 62 and entered into segment 64 are within a
delta of the target error parameter T, or if the errors of
segment 62 are less than the target error parameter T and
the corresponding quantization matrix entry is at a
maximum (processing segment 56), the search is terminated
and the current element of the quantization matrix is
outputted to comprise an element of the final
quantization matrix 78. Otherwise,
if the element of the perceptual error matrix is less than
the target parameter T, the corresponding entry
(segment 56) of the quantization matrix is incremented.
Conversely, if the element of the perceptual error
matrix is greater than the target parameter T, the
corresponding entry (segment 56) of the quantization matrix
is decremented. The incrementing and decrementing is
accomplished by processing segment 66.

A bisection method, performed in segment 66, is
typically used to determine whether to increment or
decrement the initial matrix 35 entered into step 56. In
the bisection method a range is established for q_ij
between lower and upper bounds, typically 1 and 255
respectively. The perceptual error matrix p_ij is
evaluated at the mid-point of the range. If p_ij is greater than
the target error parameter T, then the upper bound is
reset to the mid-point, so that smaller quantization
entries are tried; otherwise the lower bound is
reset to the mid-point. This procedure is repeated until
the mid-point no longer changes. As a practical matter,
since the quantization matrix entries q_ij in the baseline
JPEG standard are eight bit integers, the needed degree
of accuracy is normally obtained in nine iterations from
a starting range of 1-255 (initial entry into segment 56).
The output of the program segment 66 is applied to the
quantize segment 56 and then steps 56-66 are repeated,
if necessary, for the remaining elements in the initial
matrix. The processing segments shown in FIG. 3 yield
a compressed image with a resulting bit rate; however,
if a particular bit rate is desired for the image, then the
processing segments shown in FIG. 4 and given in
Table 5 below are applicable.
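
The bisection search for a single quantization matrix entry can be sketched as below. This is a minimal illustration, not the routine of segment 66: the function names and sample values are hypothetical, the pooling form is assumed, and the search treats the pooled error as if it grew with the step size, which holds only approximately for real quantization error.

```python
def perceptual_error(coeffs, masks, q, beta=4.0):
    """Pooled perceptual error p_ij for a trial step size q (assumed pooling form)."""
    total = 0.0
    for c, m in zip(coeffs, masks):
        e = c - q * round(c / q)       # quantization error e_ijk
        total += abs(e / m) ** beta    # masked error d_ijk, pooled over blocks
    return total ** (1.0 / beta)


def bisect_step_size(coeffs, masks, target, lo=1, hi=255):
    """Search the integer range [lo, hi] for an entry whose error stays within target."""
    if perceptual_error(coeffs, masks, hi) <= target:
        return hi                      # error below target even at the maximum entry
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if perceptual_error(coeffs, masks, mid) > target:
            hi = mid                   # error too visible: try smaller entries
        else:
            lo = mid                   # error acceptable: try larger entries
    return lo


coeffs = [37.0, -12.0, 5.0, 80.0, -3.0]   # hypothetical c_ijk for one frequency
masks = [2.0, 2.0, 2.0, 4.0, 2.0]         # hypothetical m_ijk
q = bisect_step_size(coeffs, masks, target=1.0)
print(q, perceptual_error(coeffs, masks, q))
```

Because the interval is halved on each pass, a range of 1 to 255 is exhausted in about eight passes, in line with the iteration count mentioned above.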

TABLE 5

Processing Segment   Nomenclature

80                   Select desired bit rate
82                   Set initial target perceptual error
84                   Optimize quantization matrix
                     (56, 58, 60, 62, 64 and 66)
86                   Quantize
88                   Entropy code
90                   Decision box (Is the bit rate ≈ desired bit
                     rate?)
92                   Adjust target perceptual error

The processing segments 80-92, shown in FIG. 4 and
given in Table 5, allow for the attainment of a particular
bit rate and utilize a second, higher-order optimization
in which, if the first optimization results in a bit rate which
is greater than desired, the value of the target
perceptual error parameter T of segment 92 is incremented.
Conversely, if a bit rate results which is lower than
desired, the value of the target perceptual error
parameter T of segment 92 is decremented.

The sequence of FIG. 4 starts with the selection (seg- 
ment 80) of the desired bit rate followed by the setting 
(segment 82) of the initial target perceptual error. The 
output of segment 82, as well as the output of segment 
92, is applied to segment 84 which comprises segments 
56, 58, 60, 62, 64 and 66, all previously described with 
reference to FIG. 3 and all of which contribute to provide
an optimized quantization matrix in a manner also
described with reference to FIG. 3. The output of seg- 
ment 84 is applied to the quantize segment 86 which 
operates in a similar manner as described for the quan- 
tize segment 38 of FIG. 2. The output of segment 86 is 
applied to the entropy code segment 88 which operates 
in a similar manner as described for the entropy code 40 
of FIG. 2. 

To accomplish the adjustment of the bit rate, the
output of processing segment 88 is applied to a decision
segment 90 in which the actual bit rate is compared
against the desired bit rate, and the result of such
comparison determines the described incrementing or
decrementing of the target perceptual error parameter T.
After such incrementing or decrementing, the
processing steps 84-90 are repeated until the actual bit rate is
approximately equal to the desired bit rate, and the final
quantization matrix 78 is created.
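
The outer loop of FIG. 4 can be sketched as a search over the target perceptual error. The description says only that the target is incremented or decremented; the bisection used here, the function names, the search bounds, the tolerance and the toy bit-rate model are all assumptions made for illustration. The callable `bit_rate_for` stands in for segments 84 through 88 and is assumed to return a bit rate that falls as the target error rises.

```python
def find_target_error(bit_rate_for, desired, lo=0.0, hi=16.0,
                      tol=1e-3, max_iter=60):
    """Adjust the target perceptual error until the bit rate matches `desired`."""
    target = 0.5 * (lo + hi)
    for _ in range(max_iter):
        target = 0.5 * (lo + hi)
        rate = bit_rate_for(target)          # segments 84-88 (stand-in)
        if abs(rate - desired) <= tol:       # decision segment 90
            break
        if rate > desired:
            lo = target                      # too many bits: raise the target error
        else:
            hi = target                      # too few bits: lower the target error
    return target


def toy_bit_rate(target):
    """Hypothetical stand-in for optimize, quantize and entropy code."""
    return 4.0 / (1.0 + target)


t = find_target_error(toy_bit_rate, desired=1.0)
print(t, toy_bit_rate(t))
```

In a real coder each call to `bit_rate_for` would rerun the full quantization matrix optimization, so the number of outer iterations dominates the cost.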

It should now be appreciated that the practice of the 
present invention provides for a quantization matrix 78 
that yields minimum perceptual error for a given bit rate or
a minimum bit rate for a given perceptual error. The 
present invention, as already discussed, provides for 
visual masking by luminance and contrast techniques as 
well as by error pooling. The luminance masking fea- 
ture may be further described with reference to FIG. 5. 

FIG. 5 has a Y axis given in the log function of t_ij of
expression (1) and an X axis given in display luminance L
measured in cd/m². The quantity t_ij of the block of data
shown in FIG. 5 is based on a maximum display
luminance L of 100 cd/m² and a grey scale (reference scale
for use in black-and-white television, consisting of
several defined levels of brightness with neutral color)
resolution of eight (8) bits.

Two families of curves are shown in FIG. 5 with one
being 94A, 94B, 94C, 94D and 94E (shown in solid
representations), and the other being 96A, 96B, 96C,
96D and 96E (shown in phantom representations). The
families 94A . . . 94E and 96A . . . 96E are plots for the DCT
coefficient frequencies given in Table 6.


TABLE 6

Plots           Frequency

94A and 96A     7,7
94B and 96B     0,7
94C and 96C     0,0
94D and 96D     0,3
94E and 96E     0,1


Detection threshold (t_ij) for a luminance pattern
typically depends upon mean luminance from the local
image region. More particularly, the higher the
background luminance of the image being displayed, the higher
the luminance threshold. This is usually called “light
adaptation” but it is called herein “luminance masking.”

FIG. 5 illustrates this effect whereby higher
background luminance yields higher luminance thresholds,
wherein the solid plots 94A . . . 94E indicate this
interdependency. In particular, it is seen that the value of t_ij
for each of the plots 94A . . . 94E increases with
increasing values of luminance. The plots 94A . . . 94E
illustrate that as much as 0.5 log units in t_ij might be expected
to occur within an image, due to variations in the mean
luminance of a block. The present invention takes this
variation into account, whereas known prior art
techniques fail to consider this wide variation.

The effect of mean luminance upon DCT coefficients
of the quantization matrix q_ij is complex, involving both
vertical and horizontal shifts of the contrast sensitivity
function. The luminance-masked threshold may be
determined by equation 8:

t_{ijk} = \mathrm{apw}[i, j, L_0 \, c_{00k} / c_{00}]    (8)

where c_00k is the DC coefficient of the DCT for block k, L_0
is the mean luminance of the display, and c_00 is the DC
coefficient corresponding to L_0 (1024 for an eight (8) bit
image). The solution is as complete and accurate as the
underlying formula, but may be rather expensive to
compute. For example, in the “Mathematica” language,
using a compiled function, and running on channelized
equipment 14 (SUN SPARC 2) of FIG. 1, it took about 1
second per block to compute this function. A second,
simpler solution is to approximate the dependence of t_ijk
upon c_00k with the power function of the equation (2)
previously given.

In practice, the initial calculation of t_ij should be made
assuming a selected displayed luminance L_0. The
parameter a_t has a typical value of 0.649. It should be
noted that luminance masking may be suppressed by
setting a_t equal to 0. More generally, a_t controls the
degree to which the masking of FIG. 5 occurs. It should
be further noted that the power function given in
expression 2 makes it easy to incorporate a non-unity
display Gamma, by multiplying a_t by the Gamma
exponent having a typical value of 2.3.
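
The power-function approximation can be sketched as follows. Expression (2) is not reproduced in this part of the description, so the form used here, the global threshold t_ij scaled by the ratio c_00k / c_00 raised to the exponent a_t, is an assumption inferred from the surrounding text; the function and parameter names are likewise hypothetical.

```python
def luminance_masked_threshold(t_ij, c_00k, c_00=1024.0, a_t=0.649, gamma=1.0):
    """Approximate the block threshold t_ijk from the global threshold t_ij.

    Assumed form: t_ijk = t_ij * (c_00k / c_00) ** (a_t * gamma), with c_00k > 0.
    """
    return t_ij * (c_00k / c_00) ** (a_t * gamma)


base = luminance_masked_threshold(2.0, c_00k=1024.0)             # block at mean luminance
bright = luminance_masked_threshold(2.0, c_00k=2048.0)           # brighter block
dark = luminance_masked_threshold(2.0, c_00k=512.0)              # darker block
no_masking = luminance_masked_threshold(2.0, 2048.0, a_t=0.0)    # masking suppressed
print(base, bright, dark, no_masking)
```

A brighter block receives a higher threshold and therefore tolerates more quantization error, while setting the exponent to zero returns the global threshold unchanged for every block.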

The family of plots 96A . . . 96E of FIG. 5 vary in
accordance with the relationship of expression 2 and are
relatively accurate for luminances above about 10
cd/m². Except for very dark sections of an image, this
range should be more than adequate for most image
compressions. The discrepancy or inaccuracy of plots
96A . . . 96E is also greatest at the lowest frequencies,
especially at the DC term (0,0) c_00k. This discrepancy could
be corrected by adopting a matrix of exponents, one for
each frequency, for the relationship given in expression
(2).

It should now be appreciated that the practice of the 
present invention provides luminance masking (shown 
in FIG. 5 and performed in processing segment 52 of 
FIG. 3) which allows for an improved quality of the 
compressed image so that it may be more clearly repro-
duced and seen by the human eye.

As previously discussed with reference to processing 
segment 54 of FIG. 3, the present invention also pro- 
vides for contrast masking. Contrast masking refers to 
the reduction in the visibility of one image component 
by the presence of another. This masking is strongest 
when both components are of the same spatial fre- 
quency, orientation, and location within the digital 
image being compressed. Contrast masking is achieved 
in the present invention by expression (3) as previously 
described. The benefits of the contrast masking function
are illustrated in FIG. 6.

FIG. 6 has a Y axis given in the quantity $m_{ijk}$ (DCT
mask) and an X axis given in the quantity $c_{ijk}$ (DCT
coefficient) and illustrates a response plot 98 of the
DCT mask $m_{ijk}$ as a function of the DCT coefficient
$c_{ijk}$ for the parameters $w_{ij} = 0.7$ and $t_{ijk} = 2$. Because the
effect of the DC coefficient $c_{00k}$ upon the luminance
masking (see FIG. 5) has already been expressed, the
plot 98 does not include an effect of the DC coefficient
$c_{00k}$; this is accomplished by setting the value of $w_{00}$
equal to 0. From FIG. 6 it may be seen that the DCT
mask ($m_{ijk}$) increases linearly for $c_{ijk}$ values
between about 2 and 10. This DCT mask ($m_{ijk}$), generated
by processing segment 54, adjusts each block for compo-
nent contrast and is used in processing segment 60 to
scale (divide) the quantization error. Both functions
of the DCT mask ensure good digital compression
while still providing an image having good visual as-
pects.
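
The contrast masking rule of expression (3), as restated in claim 4, can be sketched as follows. This is only an illustrative sketch: the function and argument names are hypothetical, and the default exponent of 0.7 and the example threshold of 2 are the values quoted for FIG. 6.

```python
def dct_mask(c_ijk, t_ijk, w_ij=0.7):
    """Sketch of expression (3): m_ijk = Max[t_ijk, |c_ijk|**w_ij * t_ijk**(1 - w_ij)].

    c_ijk -- DCT coefficient at frequency (i, j) in block k
    t_ijk -- luminance-adjusted threshold for that coefficient (must be > 0)
    w_ij  -- masking exponent between 0 and 1; 0 disables contrast masking
    """
    if t_ijk <= 0:
        raise ValueError("threshold must be positive")
    return max(t_ijk, abs(c_ijk) ** w_ij * t_ijk ** (1.0 - w_ij))


if __name__ == "__main__":
    # Below the threshold the mask stays at t_ijk; above it the mask grows
    # as a power of the coefficient magnitude.
    for c in (0.5, 2.0, 5.0, 10.0):
        print(c, dct_mask(c, t_ijk=2.0))
```

Passing `w_ij=0.0` for the DC term reproduces the behaviour described above, where the mask for $c_{00k}$ never rises above its threshold.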

It should be appreciated that the practice of the in- 
vention provides contrast masking so as to provide for 
a high quality visual representation of compressed digi- 
tal images as compared to other prior art techniques. 

The overall operation of the present invention is
essentially illustrated in FIG. 7. FIG. 7 has a Y axis
given in bits/pixel of the digital image and an X axis
given in perceptual error. FIG. 7 illustrates two plots
100 and 102 for two different images that were com-
pressed and reconstituted in accordance with the here-
inbefore described principles of the present invention.
From FIG. 7 it is seen that an increasing bits/pixel rate
causes a decrease in the perceptual error.

The description given hereinbefore
yields a desired quantization matrix $q_{ij}$ with a specified
perceptual error $\Psi$. However, if desired, one may obtain a
quantization matrix $q_{ij}$ that uses a given bit rate $h_0$ with
a minimum perceptual error $\Psi$. This can be done itera-
tively by noting that the bit rate is a decreasing function
of the perceptual error $\Psi$, as shown in FIG. 7. In the
practice of the present invention, a second order inter-
polating polynomial is fit to all previously estimated values
of $\{h, \Psi\}$ to estimate a next candidate $\Psi$, terminating
when $|h - h_0| < \Delta h$, where $\Delta h$ is the desired accuracy in
bit rate. On each iteration a complete estimation of $q_{ij}$ is
performed.
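
The iteration described above may be easier to follow as code. The sketch below is deliberately simplified and swaps in plain bisection for the second order interpolating polynomial; it relies only on the stated fact that bit rate decreases as perceptual error increases. The callback `rate_for_error` is hypothetical: it stands for one complete estimation of the quantization matrix at a candidate perceptual error, followed by a measurement of the resulting bit rate.

```python
def find_error_for_bit_rate(rate_for_error, h0, psi_lo, psi_hi, dh=0.01, max_iter=50):
    """Search for a perceptual error psi whose bit rate h is within dh of h0.

    rate_for_error -- hypothetical callable psi -> bits/pixel, assumed decreasing
    h0             -- target bit rate
    psi_lo, psi_hi -- bracket with rate_for_error(psi_lo) > h0 > rate_for_error(psi_hi)
    dh             -- desired accuracy in bit rate
    Returns the last (psi, h) pair tried.
    """
    psi, h = psi_lo, rate_for_error(psi_lo)
    for _ in range(max_iter):
        psi = 0.5 * (psi_lo + psi_hi)
        h = rate_for_error(psi)      # one complete matrix estimation per iteration
        if abs(h - h0) < dh:
            break
        if h > h0:
            psi_lo = psi             # rate too high: tolerate more error
        else:
            psi_hi = psi             # rate too low: tolerate less error
    return psi, h


if __name__ == "__main__":
    # Toy stand-in for the real encoder: bit rate falls off as 2 / psi.
    psi, h = find_error_for_bit_rate(lambda p: 2.0 / p, h0=1.0, psi_lo=0.5, psi_hi=8.0)
    print(psi, h)
```

Bisection needs more iterations than polynomial interpolation, and each iteration is costly, which is presumably why the interpolating fit is preferred in practice.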

It should now be appreciated that the practice of the
present invention provides a perceptual error that in- 
corporates visual masking by luminance and contrast 
techniques, and error pooling to estimate the matrix that 
has a minimum perceptual error for a given bit rate, or 
minimal bit rate for a given perceptual error. All told 
the present invention provides a digital compression 
technique that is useful in the transmission and repro- 
duction of images particularly those found in high defi- 
nition television applications. 

Further, although the invention has been described 
relative to a specific embodiment thereof, it is not so 
limited, and many modifications and variations thereof
will now be readily apparent to those skilled in the art
in light of the above teachings.

What I claim is: 

1. The method for transforming a block having an
index k of pixels from an electronic image into a digital
representation of said image comprising the steps of:

(a) applying a Discrete Cosine Transform (DCT) to
transform said block of pixels into digital signals
represented by DCT transformation coefficients
($c_{ijk}$);

(b) selecting a DCT mask ($m_{ijk}$) for each block of
pixels based on parameters comprising said DCT
transformation coefficients $c_{ijk}$ and display and
perceptual parameters;

(c) selecting a quantization matrix ($q_{ij}$) comprising the
steps of:

(i) selecting an initial value of $q_{ij}$ having a DCT
coefficient $c_{ijk}$;

(ii) quantizing each DCT coefficient $c_{ijk}$ in each
block k to form quantized coefficient $u_{ijk}$;

(iii) de-quantizing the quantized coefficients $u_{ijk}$ by
multiplying them by $q_{ij}$ to form $q_{ij} u_{ijk}$;

(iv) subtracting the de-quantized coefficient $q_{ij} u_{ijk}$
from $c_{ijk}$ to compute a quantization error $e_{ijk}$;

(v) dividing $e_{ijk}$ by the DCT mask $m_{ijk}$ to obtain
perceptual errors;

(vi) pooling the perceptual errors of one frequency
ij over all blocks k to obtain an entry in a percep-
tual error matrix $p_{ij}$;

(vii) repeating steps i-vi for each frequency ij;

(viii) adjusting the values $q_{ij}$ up or down until each
entry in the perceptual error matrix $p_{ij}$ is within
a target range; and

(ix) entropy coding said quantization matrix and
the quantized DCT coefficients of said image.
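
Steps (ii) through (viii) for a single frequency ij can be illustrated with the short Python sketch below. It is an illustration under stated assumptions rather than the claimed method itself: the function names are hypothetical, the pooling uses an exponent of 4 as in the pooling expression of claims 5 and 10, the search for $q_{ij}$ is a simple integer bisection between 1 and 255, and the "target range" is taken to mean a pooled error no greater than a target value. Entropy coding (step ix) is not shown.

```python
def pooled_error(coeffs, masks, q, beta=4.0):
    """Steps (ii)-(vi) for one frequency ij over all blocks k."""
    total = 0.0
    for c, m in zip(coeffs, masks):
        u = round(c / q)           # (ii) quantize to the nearest integer
        e = c - q * u              # (iii)-(iv) de-quantize and subtract
        d = e / m                  # (v) perceptual error
        total += abs(d) ** beta    # (vi) pool over blocks
    return total ** (1.0 / beta)


def optimize_entry(coeffs, masks, target=1.0, q_max=255, beta=4.0):
    """Step (viii), sketched as bisection over integer q in [1, q_max].

    Assumes pooled_error(..., q=1) <= target, which holds for integer-valued
    coefficients because their quantization error is then zero.
    """
    if pooled_error(coeffs, masks, q_max, beta) <= target:
        return q_max
    lo, hi = 1, q_max              # invariant: error(lo) <= target < error(hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if pooled_error(coeffs, masks, mid, beta) <= target:
            lo = mid               # error acceptable: try a coarser step
        else:
            hi = mid               # error too large: use a finer step
    return lo


if __name__ == "__main__":
    coeffs = [10.0, -23.0, 47.0, 5.0]   # c_ijk for one frequency, four blocks
    masks = [2.0, 2.0, 2.0, 2.0]        # m_ijk for the same blocks
    q = optimize_entry(coeffs, masks)
    print(q, pooled_error(coeffs, masks, q))
```

Repeating `optimize_entry` for every frequency ij (step vii) would fill in the whole matrix. Because the pooled error is not strictly monotonic in q, bisection returns a boundary value that satisfies the target rather than a guaranteed global maximum.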

2. The method of transforming an image according to 
claim 1 further comprising the steps: 

(a) transmitting entropy coded image to a user of said 
digital image; 

(b) decoding said entropy coded image; 

(c) decoding said quantization matrix to derive said 
DCT transformation coefficients; and 

(d) applying an inverse Discrete Cosine Transform to 
derive said block of pixels. 

3. The method of transforming an image according to 
claim 2, wherein block mean luminance and component 
contrast are included in said display and perceptual 
parameters and are determined by the following expres- 
sion: 


$$t_{ijk} = t_{ij} (c_{00k} / \bar{c}_{00})^{a_t}$$

where $a_t$ is a luminance-masking exponent having a
value of about 0.65, $t_{ij}$ is the un-adjusted threshold of
luminance, $\bar{c}_{00}$ is an average of the DC terms of the
DCT coefficients $c_{ijk}$ of the image, and $c_{00k}$ is the DC
coefficient of block k.

4. The method of transforming an image according to
claim 2, wherein said DCT mask is determined by the
following expression:

$$m_{ijk} = \mathrm{Max}[t_{ijk}, |c_{ijk}|^{w_{ij}} t_{ijk}^{1 - w_{ij}}]$$

where $c_{ijk}$ is the DCT coefficient, $t_{ijk}$ is a corresponding
block mean luminance adjusted threshold, and $w_{ij}$ is an
exponent that lies between 0 and 1 and is about 0.7.

5. The method of transforming an image according to 
claim 2, wherein said pooling of perceptual errors is 
determined by the following expression: 

$$p_{ij} = \left( \sum_k |d_{ijk}|^{\beta_s} \right)^{1/\beta_s}$$

6. The computer involved in converting a block hav-
ing an index k of pixels from an electronic image into a
digital representation of said image, said computer com-
prising:

(a) means for applying a Discrete Cosine Transform
(DCT) to transform said block of pixels into digital
signals represented by DCT transformation coeffi-
cient ($c_{ijk}$);

(b) means for selecting a DCT mask ($m_{ijk}$) for each
block of pixels based on parameters comprising
said DCT transformation coefficients $c_{ijk}$ and dis-
play and perceptual parameters;

(c) means for selecting a quantization matrix ($q_{ij}$)
comprising:

(i) means for selecting an initial value of $q_{ij}$;

(ii) means for quantizing the DCT coefficient $c_{ijk}$ in
each block k to form quantized coefficient $u_{ijk}$;

(iii) means for de-quantizing the quantized coeffici-
ents $u_{ijk}$ by multiplying them by $q_{ij}$ to form $q_{ij} u_{ijk}$;

(iv) means for subtracting the de-quantized coeffi-
cient $q_{ij} u_{ijk}$ from $c_{ijk}$ to compute a quantization
error $e_{ijk}$;

(v) means for dividing $e_{ijk}$ by the DCT mask $m_{ijk}$ to
obtain perceptual errors;

(vi) means for pooling the perceptual errors of one
frequency ij over all blocks k to obtain an entry in
a perceptual error matrix $p_{ij}$;

(vii) means for adjusting the values $q_{ij}$ up or down
until each entry in the perceptual error matrix $p_{ij}$
is within a target range; and

(d) means for entropy coding said quantization matrix
and the quantized DCT coefficients of said image.

7. The computer involved in converting an image 
according to claim 6 further comprising: 

(a) means for transmitting entropy coded image to a
user of said digital image;

(b) means for decoding said entropy coded image; 

(c) means for decoding said quantization matrix to 
derive said DCT transform coefficients; and 

(d) means for applying an inverse Discrete Cosine
Transform to derive said block of pixels.

8. The computer involved in converting an image
according to claim 6, wherein block mean luminance
and component contrast are included in said display and
perceptual parameters and are determined by the fol-
lowing expression embodied in the operation of said
means for selecting a DCT mask:

$$t_{ijk} = t_{ij} (c_{00k} / \bar{c}_{00})^{a_t}$$

where $a_t$ is a luminance-masking exponent having a
value of about 0.65, $t_{ij}$ is the un-adjusted threshold of
luminance, $\bar{c}_{00}$ is an average of the DC terms of the
DCT coefficients $c_{ijk}$ of the image, and $c_{00k}$ is the DC
coefficient of block k.

9. The computer network involved in converting an 
image according to claim 6, wherein said DCT mask is 
determined by the following expression embodied in 
said means for selecting a DCT mask: 

$$m_{ijk} = \mathrm{Max}[t_{ijk}, |c_{ijk}|^{w_{ij}} t_{ijk}^{1 - w_{ij}}]$$

where $m_{ijk}$ is the contrast-adjusted threshold, $c_{ijk}$ is the DCT
coefficient, $t_{ijk}$ is the corresponding threshold of expres-
sion 2, and $w_{ij}$ is the exponent that lies between 0 and 1
and typically has a value of 0.7.

10. The computer involved in converting an image
according to claim 6, wherein said pooling of percep-
tual errors is determined by the following expression
embodied in said means for pooling:

$$p_{ij} = \left( \sum_k |d_{ijk}|^{\beta_s} \right)^{1/\beta_s}$$

where $d_{ijk}$ is an error in a particular frequency i, j, and
block k, and $\beta_s$ is a pooling exponent having a typical value
of 4.

* * * * * 